127 research outputs found

    Legal Interoperability Issues in the Framework of the OpenMinTeD Project: A Methodological Overview

    Get PDF
    This paper is a first analysis of the legal interoperability issues in the framework of the OpenMinTeD (OMTD) project (www.openminted.eu), which aims to create an open, service-oriented e-Infrastructure for Text and Data Mining (TDM) of scientific and scholarly content. The paper offers an overview into the methods for achieving such interoperability

    Foundation of a Component-based Flexible Registry for Language Resources and Technology

    Get PDF
    Within the CLARIN e-science infrastructure project it is foreseen to develop a component-based registry for metadata for Language Resources and Language Technology. With this registry it is hoped to overcome the problems of the current available systems with respect to inflexible fixed schema, unsuitable terminology and interoperability problems. The registry will address interoperability needs by refering to a shared vocabulary registered in data category registries as they are suggested by ISO

    Introducing the Digital Language Equality Metric: Technological Factors

    Get PDF
    This paper introduces the concept of Digital Language Equality (DLE) developed by the EU-funded European Language Equality (ELE) project, and describes the associated DLE Metric with a focus on its technological factors (TFs), which are complemented by situational contextual factors. This work aims at objectively describing the level of technological support of all European languages and lays the foundation to implement a large-scale EU-wide programme to ensure that these languages can continue to exist and prosper in the digital age, to serve the present and future needs of their speakers. The paper situates this ongoing work with a strong European focus in the broader context of related efforts, and explains how the DLE Metric can help track the progress towards DLE for all languages of Europe, focusing in particular on the role played by the TFs. These are derived from the European Language Grid (ELG) Catalogue, that provides the empirical basis to measure the level of digital readiness of all European languages. The DLE Metric scores can be consulted through an online interactive dashboard to show the level of technological support of each European language and track the overall progress toward DLE

    A Metadata Schema for the Description ofLanguage Resources (LRs)

    Get PDF
    This paper presents the metadata schema for describing language resources (LRs) currently under development for the needs of META-SHARE, an open distributed facility for the exchange and sharing of LRs. An essential ingredient in its setup is the existence of formal and standardized LR descriptions, cornerstone of the interoperability layer of any such initiative. The description of LRs is granular and abstractive, combining the taxonomy of LRs with an inventory of a structured set of descriptive elements, of which only a minimal subset is obligatory; the schema additionally proposes recommended and optional elements. Moreover, the schema includes a set of relations catering for the appropriate inter-linking of resources. The current paper presents the main principles and features of the metadata schema, focusing on the description of text corpora and lexical / conceptual resources

    OpenMinTeD: A Platform Facilitating Text Mining of Scholarly Content

    Get PDF
    The OpenMinTeD platform aims to bring full text Open Access scholarly content from a wide range of providers together with Text and Data Mining (TDM) tools from various Natural Language Processing frameworks and TDM developers in an integrated environment. In this way, it supports users who want to mine scientific literature with easy access to relevant content and allows running scalable TDM workflows in the cloud

    Definition of interfaces

    Get PDF
    The aim of this report is to define the interfaces for the tools used in the MT development and evaluation scenarios as included in the QTLaunchPad (QTLP) infrastructure. Specification of the interfaces is important for the interaction and interoperability of the tools in the developed QTLP infrastructure. In addressing this aim, the report provides: 1. Descriptions of the common aspects of the tools and their standardized data formats; 2. Descriptions of the interfaces for the tools for interoperability. where the tools are categorized into preparation, development, and evaluation categories including the human interfaces for quality assessment with multidimensional quality metrics. Interface specifications allow a modular tool infrastructure, flexibly selecting among alternative implementations, enabling realistic expectations to be made at different sections of the QTLP information flow pipeline, and supporting the QTLP infrastructure. D3.2.1 allows the emergence of the QTLP infrastructure and helps the identification and acquisition of existing tools (D4.4.1), the integration of identified language processing tools (D3.3.1), their implementation (D3.4.1), and their testing (D3.5.1). QTLP infrastructure will facilitate the organization and running of the quality translation shared task (D5.2.1). We also provide human interfaces for translation quality assessment with the multidimensional quality metrics (D1.1.1). D3.2.1 is a living document until M12, which is when the identification and acquisition of existing tools (D4.4.1) and the implementation of identified language processing tools (D3.4.1) are due

    Documentation and User Manual of the META-SHARE Metadata Model

    Get PDF
    The current deliverable presents the META-SHARE metadata schema v1.0, as implemented in the META-SHARE XSD\u27s v1.0 released to (META-NET and PSP partners) in July 2011 for text corpora and lexical/conceptual resources and its supplement for audio corpora, tools and language descriptions (simplified/refactored version) as implemented in November. It is meant to act as a user manual, providing explanations on the model contents for LRs providers and LRs curators that wish to describe their resources in accordance to it. Work on the schema is ongoing and changes/updates to the model are constantly being made; where appropriate, some changes that are already under way are documented in this deliverable

    Introducing the digital language equality metric: technological factors

    Get PDF
    This paper introduces the concept of Digital Language Equality (DLE) developed by the EU-funded European Language Equality (ELE) project, and describes the associated DLE Metric with a focus on its technological factors (TFs), which are complemented by situational contextual factors. This work aims at objectively describing the level of technological support of all European languages and lays the foundation to implement a large-scale EU-wide programme to ensure that these languages can continue to exist and prosper in the digital age, to serve the present and future needs of their speakers. The paper situates this ongoing work with a strong European focus in the broader context of related efforts, and explains how the DLE Metric can help track the progress towards DLE for all languages of Europe, focusing in particular on the role played by the TFs. These are derived from the European Language Grid (ELG) Catalogue, that provides the empirical basis to measure the level of digital readiness of all European languages. The DLE Metric scores can be consulted through an online interactive dashboard to show the level of technological support of each European language and track the overall progress toward DLE

    The strategic impact of META-NET on the regional, national and international level

    Get PDF
    This article provides an overview of the dissemination work carried out in META-NET from 2010 until 2015; we describe its impact on the regional, national and international level, mainly with regard to politics and the funding situation for LT topics. The article documents the initiative's work throughout Europe in order to boost progress and innovation in our field.Peer ReviewedPostprint (author's final draft

    Digital language equality: definition, metric, dashboard

    Get PDF
    This chapter presents the concept of Digital Language Equality (DLE) that was at the heart of the European Language Equality (ELE) initiative, and describes the DLE Metric, which includes technological factors (TFs) and contextual factors (CFs): the former concern the availability of Language Resources and Technologies (LRTs) for the languages of Europe, based on the data included in the European Language Grid (ELG) catalogue, while the latter reflect the broader socio-economic contexts and ecosystems of the languages, as these determine the potential for LRT development. The chapter discusses related work, presents the DLE definition and describes how it was implemented through the DLE Metric, explaining how the TFs and CFs were quantified. The resulting scores of the DLE Metric for Europe’s languages can be visualised and compared through the interactive DLE dashboard, to monitor the progress towards DLE in Europe
    corecore